
Non-Local Block


The paper proposes a new approach (R1), and the idea of the error-correction mechanism is intuitive (R1), novel (R2) and smart

Neural Information Processing Systems

Reviewer questions addressed: whether any special feature operation is applied in ETN; whether a larger K helps; the motivation for computing affinity matrices; and how the error diffusion is achieved (please see Figure 1 in the submission for an example). Performance issues, including the increased training burden and running time, are acknowledged. Thanks for pointing out the mistake in real-time stylization, which will be corrected in the revision.



Super-Resolution Generative Adversarial Networks based Video Enhancement

Çetin, Kağan, Akça, Hacer, Gerek, Ömer Nezih

arXiv.org Artificial Intelligence

This study introduces an enhanced approach to video super-resolution by extending the ordinary Single-Image Super-Resolution (SISR) Super-Resolution Generative Adversarial Network (SRGAN) structure to handle spatio-temporal data. While SRGAN has proven effective for single-image enhancement, its design does not account for the temporal continuity required in video processing. To address this, a modified framework incorporating 3D Non-Local Blocks is proposed, enabling the model to capture relationships across both spatial and temporal dimensions. An experimental training pipeline, based on patch-wise learning and advanced data degradation techniques, is developed to simulate real-world video conditions and to learn from both local and global structures and details. This helps the model generalize better and maintain stability across varying video content, preserving overall structure as well as pixel-wise correctness. Two model variants, one larger and one more lightweight, are presented to explore the trade-offs between performance and efficiency. The results demonstrate improved temporal coherence, sharper textures, and fewer visual artifacts compared to traditional single-image methods. This work contributes to the development of practical, learning-based solutions for video enhancement tasks, with potential applications in streaming, gaming, and digital restoration.
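The spatio-temporal extension described above can be illustrated with a small NumPy sketch of a 3D non-local operation: every position in a video feature volume attends to every other position across all frames. This is a minimal sketch in the embedded-Gaussian form of the original non-local block, assuming random illustrative projection weights rather than the authors' trained model.

```python
import numpy as np

rng = np.random.default_rng(0)

def nonlocal_3d(x, w_theta, w_phi, w_g, w_out):
    """x: (T, H, W, C) video feature volume; returns x plus a global response."""
    t, h, w, c = x.shape
    n = t * h * w
    flat = x.reshape(n, c)                   # every spatio-temporal position
    theta = flat @ w_theta                   # queries (n, c_inner)
    phi = flat @ w_phi                       # keys    (n, c_inner)
    g = flat @ w_g                           # values  (n, c_inner)
    logits = theta @ phi.T                   # pairwise affinities (n, n)
    logits -= logits.max(axis=1, keepdims=True)
    attn = np.exp(logits)
    attn /= attn.sum(axis=1, keepdims=True)  # softmax over all positions/frames
    y = (attn @ g) @ w_out                   # aggregate, project back to C
    return x + y.reshape(t, h, w, c)         # residual connection

T, H, W, C, C_INNER = 2, 4, 4, 8, 4          # illustrative tiny dimensions
x = rng.standard_normal((T, H, W, C))
wt, wp, wg = (rng.standard_normal((C, C_INNER)) * 0.1 for _ in range(3))
wo = rng.standard_normal((C_INNER, C)) * 0.1
out = nonlocal_3d(x, wt, wp, wg, wo)
print(out.shape)  # (2, 4, 4, 8)
```

Because the affinity matrix couples positions from different frames, a feature in frame 0 can directly draw on context from frame 1, which is the property the ordinary 2D SRGAN lacks.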


FENet: Focusing Enhanced Network for Lane Detection

Wang, Liman, Zhong, Hanyang

arXiv.org Artificial Intelligence

Inspired by human driving focus, this research pioneers networks augmented with Focusing Sampling, Partial Field of View Evaluation, an Enhanced FPN architecture, and Directional IoU Loss: targeted innovations addressing obstacles to precise lane detection for autonomous driving. Experiments demonstrate that our Focusing Sampling strategy, which emphasizes vital distant details unlike uniform approaches, significantly boosts both benchmark accuracy and the practical curved/distant lane recognition essential for safety. While FENetV1 achieves state-of-the-art performance on conventional metrics via enhancements isolating perspective-aware contexts that mimic driver vision, FENetV2 proves most reliable on the proposed Partial Field analysis. Hence we specifically recommend V2 for practical lane navigation despite a fractional degradation on standard entire-image measures. Future directions include collecting on-road data and integrating complementary dual frameworks to drive further breakthroughs guided by human perception principles. The code is available at https://github.com/HanyangZhong/FENet.


Old Photo Restoration using Deep Learning

#artificialintelligence

As you can see in these images, there is a big difference between the synthesized old images and the real old ones. You can see that the synthesized image is already in high definition even with the fake scratches and color changes, compared to the other one, which contains far fewer details. They addressed this issue by creating their own new network specifically for the task. Basically, they used two variational auto-encoders, also called VAEs, to respectively transform old (degraded) and clean (restored) photos into two latent spaces. This translation into latent spaces is learned from synthetic paired data but generalizes well to real photos, since the domain gap is much smaller in such compact latent spaces. The gap between the two latent spaces produced by the VAEs is closed by jointly training an adversarial discriminator.


LRTD: Long-Range Temporal Dependency based Active Learning for Surgical Workflow Recognition

Shi, Xueying, Jin, Yueming, Dou, Qi, Heng, Pheng-Ann

arXiv.org Machine Learning

Automatic surgical workflow recognition in video is a fundamental yet challenging problem for developing computer-assisted and robot-assisted surgery. Existing deep learning approaches have achieved remarkable performance on the analysis of surgical videos; however, they rely heavily on large-scale labelled datasets. Unfortunately, such annotation is often not available in abundance, because it requires the domain knowledge of surgeons. In this paper, we propose a novel active learning method for cost-effective surgical video analysis. Specifically, we propose a non-local recurrent convolutional network (NL-RCNet), which introduces a non-local block to capture the long-range temporal dependency (LRTD) among continuous frames. We then formulate an intra-clip dependency score to represent the overall dependency within a clip. By ranking these scores across clips in the unlabelled data pool, we select the clips with weak dependencies to annotate, as these are the most informative ones for network training. We validate our approach on a large surgical video dataset (Cholec80) by performing the surgical workflow recognition task. Using our LRTD-based selection strategy, we outperform other state-of-the-art active learning methods. Using only up to 50% of the samples, our approach can exceed the performance of full-data training.
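The selection step described above can be sketched in a few lines: score each unlabelled clip by the strength of its pairwise frame affinities, then send the weakest-dependency clips to the annotator. This is an illustrative sketch only; plain cosine similarity stands in for the learned non-local affinities, and all names and sizes here are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)

def intra_clip_dependency(clip_feats):
    """clip_feats: (frames, dim) per-frame features; returns a scalar score."""
    f = clip_feats / np.linalg.norm(clip_feats, axis=1, keepdims=True)
    aff = f @ f.T                             # pairwise frame affinity matrix
    n = len(f)
    off_diag = aff[~np.eye(n, dtype=bool)]    # ignore self-affinity
    return off_diag.mean()                    # overall within-clip dependency

# Unlabelled pool of 5 clips, 10 frames each, 32-dim frame features (dummy data)
clips = [rng.standard_normal((10, 32)) for _ in range(5)]
scores = np.array([intra_clip_dependency(c) for c in clips])

budget = 2
to_annotate = np.argsort(scores)[:budget]     # weakest dependencies first
print(sorted(to_annotate.tolist()))
```

Clips whose frames are weakly related to each other carry more non-redundant information per annotated frame, which is why the lowest-scoring clips are the ones selected.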


NL-LinkNet: Toward Lighter but More Accurate Road Extraction with Non-Local Operations

Wang, Yooseung, Seo, Junghoon, Jeon, Taegyun

arXiv.org Machine Learning

Road extraction from very high resolution satellite images is one of the most important topics in the field of remote sensing. For the road segmentation problem, spatial properties of the data can usually be captured using Convolutional Neural Networks. However, this approach only considers a few local neighborhoods at a time and has difficulty capturing long-range dependencies. To overcome this problem, we propose Non-Local LinkNet with non-local blocks that can capture relations among global features. It enables each spatial feature point to refer to all other contextual information and results in more accurate road segmentation. In detail, our method achieved a 65.00% mIOU score on the DeepGlobe 2018 Road Extraction Challenge dataset. Our best model outperformed D-LinkNet, the 1st-ranked solution, by a significant gap of 0.88% mIOU with far fewer parameters. We also present empirical analyses on the proper usage of non-local blocks in the baseline model.


GCNet: Non-local Networks Meet Squeeze-Excitation Networks and Beyond

Cao, Yue, Xu, Jiarui, Lin, Stephen, Wei, Fangyun, Hu, Han

arXiv.org Artificial Intelligence

The Non-Local Network (NLNet) presents a pioneering approach for capturing long-range dependencies, by aggregating query-specific global context to each query position. However, through a rigorous empirical analysis, we have found that the global contexts modeled by the non-local network are almost the same for different query positions within an image. In this paper, we take advantage of this finding to create a simplified network based on a query-independent formulation, which maintains the accuracy of NLNet but with significantly less computation. We further observe that this simplified design shares a similar structure with the Squeeze-Excitation Network (SENet). Hence we unify them into a three-step general framework for global context modeling. Within this general framework, we design a better instantiation, called the global context (GC) block, which is lightweight and can effectively model the global context. The lightweight property allows us to apply it at multiple layers of a backbone network to construct a global context network (GCNet), which generally outperforms both the simplified NLNet and SENet on major benchmarks for various recognition tasks. The code and configurations are released at https://github.com/xvjiarui/GCNet.
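The query-independent formulation can be made concrete with a minimal NumPy sketch: instead of NLNet's per-query attention maps, one attention map is computed for the whole image and the resulting context vector is added to every position. This is an illustrative simplification; a real GC block also includes layer normalization inside the bottleneck, and the weights below are random placeholders.

```python
import numpy as np

rng = np.random.default_rng(2)

def gc_block(x, w_k, w_v1, w_v2):
    """x: (N, C) flattened feature map; returns x plus one shared context."""
    logits = x @ w_k                              # (N, 1) attention logits
    logits -= logits.max()
    a = np.exp(logits) / np.exp(logits).sum()     # ONE softmax for the image
    context = (a * x).sum(axis=0)                 # (C,) global context vector
    delta = np.maximum(context @ w_v1, 0) @ w_v2  # bottleneck transform + ReLU
    return x + delta                              # same context added everywhere

N, C, B = 16, 8, 4                                # positions, channels, bottleneck
x = rng.standard_normal((N, C))
wk = rng.standard_normal((C, 1))
wv1 = rng.standard_normal((C, B)) * 0.1
wv2 = rng.standard_normal((B, C)) * 0.1
y = gc_block(x, wk, wv1, wv2)
print(y.shape)  # (16, 8)
```

Note the cost: NLNet builds an N x N affinity matrix, while this query-independent variant needs only one N-vector of attention weights, which is what makes the block cheap enough to insert at many layers.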


Lung Nodule Classification using Deep Local-Global Networks

Al-Shabi, Mundher, Lan, Boon Leong, Chan, Wai Yee, Ng, Kwan-Hoong, Tan, Maxine

arXiv.org Artificial Intelligence

Purpose: Lung nodules have very diverse shapes and sizes, which makes classifying them as benign/malignant a challenging problem. In this paper, we propose a novel method to predict the malignancy of nodules that analyzes the shape and size of a nodule using a global feature extractor, as well as the density and structure of the nodule using a local feature extractor. Methods: We propose to use Residual Blocks with a 3x3 kernel size for local feature extraction, and Non-Local Blocks to extract global features. The Non-Local Block can extract global features without using a huge number of parameters. The key idea behind the Non-Local Block is to apply matrix multiplications between features on the same feature maps. Results: We trained and validated the proposed method on the LIDC-IDRI dataset, which contains 1,018 computed tomography (CT) scans. We followed a rigorous experimental setup, namely 10-fold cross-validation, and ignored nodules that had been annotated by fewer than 3 radiologists. The proposed method achieved state-of-the-art results with AUC=95.62%, significantly outperforming other baseline methods. Conclusions: Our proposed Deep Local-Global network accurately extracts both local and global features. Our new method outperforms state-of-the-art architectures, including DenseNet and ResNet with transfer learning.
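The "matrix multiplications between features on the same feature maps" idea reduces to a few lines of NumPy: the flattened map is multiplied against itself to form an affinity matrix, which then reweights the map. This is an illustrative dot-product non-local step without the learned 1x1 projections a real block would use.

```python
import numpy as np

rng = np.random.default_rng(3)

H, W, C = 6, 6, 16
x = rng.standard_normal((H * W, C))          # flattened (HW, C) feature map

aff = x @ x.T                                # (HW, HW): each position vs. all
aff -= aff.max(axis=1, keepdims=True)        # numerical stability
attn = np.exp(aff) / np.exp(aff).sum(axis=1, keepdims=True)
y = attn @ x                                 # globally aggregated features
print(y.shape)  # (36, 16)
```

Since the affinity matrix comes from the feature map itself, global context is gathered with no additional parameters beyond the projections, which is why the block stays lightweight even when it spans the entire nodule.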


Multiple Sclerosis Lesion Inpainting Using Non-Local Partial Convolutions

Xiong, Hao, Tao, Dacheng

arXiv.org Machine Learning

Multiple sclerosis (MS) is an inflammatory demyelinating disease of the central nervous system (CNS) that results in focal injury to the grey and white matter. The presence of white matter lesions biases morphometric analyses such as registration, individual longitudinal measurements and tissue segmentation for brain volume measurements. Lesion inpainting with intensities derived from surrounding healthy tissue represents one approach to alleviate such problems. However, existing methods inpaint lesions based on texture information derived from the local surrounding tissue, often leading to inconsistent inpainting and to artifacts such as intensity discrepancies and blurriness. Based on these observations, we propose non-local partial convolutions (NLPC), which integrate a U-Net-like network with the non-local module. The non-local module is exploited to capture long-range dependencies between the lesion area and the remaining normal-appearing brain regions. The lesion area is then filled by referring to normal-appearing regions with more similar features. This method generates inpainted regions that appear more realistic and natural. Our quantitative experimental results also demonstrate the superiority of this technique over existing state-of-the-art inpainting methods.
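The partial-convolution half of NLPC can be sketched in plain NumPy for a single channel: convolve only the valid (non-lesion) pixels, renormalize by how much of each window was valid, and grow the mask so the hole shrinks each pass. This is a minimal sketch of the standard partial-convolution formulation the abstract builds on; the kernel and image below are dummy values, not the authors' trained weights.

```python
import numpy as np

def partial_conv(img, mask, kernel):
    """img, mask: (H, W); mask is 1 on valid pixels, 0 inside the lesion."""
    kh, kw = kernel.shape
    ph, pw = kh // 2, kw // 2
    ip = np.pad(img * mask, ((ph, ph), (pw, pw)))   # zero out the hole
    mp = np.pad(mask, ((ph, ph), (pw, pw)))
    out = np.zeros_like(img, dtype=float)
    new_mask = np.zeros_like(mask, dtype=float)
    for i in range(img.shape[0]):
        for j in range(img.shape[1]):
            win = ip[i:i + kh, j:j + kw]
            mwin = mp[i:i + kh, j:j + kw]
            valid = mwin.sum()
            if valid > 0:
                # scale by (window size / valid count) so partially
                # covered windows are not systematically darker
                out[i, j] = (kernel * win).sum() * (kh * kw / valid)
                new_mask[i, j] = 1.0        # hole shrinks where any input is valid
    return out, new_mask

img = np.ones((5, 5))
mask = np.ones((5, 5)); mask[2, 2] = 0      # a one-pixel "lesion"
kernel = np.full((3, 3), 1 / 9)             # simple averaging kernel
out, new_mask = partial_conv(img, mask, kernel)
print(new_mask.sum())  # 25.0: the hole is filled after one pass
```

The non-local module then steers this fill toward normal-appearing regions with similar features rather than relying only on the immediate neighborhood, which is the combination the abstract describes.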